{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Applying a Different Augmentation Strategy Per Class\n",
    "\n",
    "Let's say we wished to augment the MNIST dataset, but you wished to use a generator to supply a neural network with data. \n",
    "\n",
    "Ordinarily you could write a pipeline that would augment all the data, regardless of the class. However with MNIST you might want to have different pipelines for each of the 10 different classes. \n",
    "\n",
    "For example, it would make sense to flip images for the figure 8 both horizontally and vertically and still end up with feasible data. The figure 3 could be flipped vertically but not horizontally. Conversely, the figure 4 should not be flipped either horizontally or vertically. \n",
    "\n",
    "We can do this by creating 10 different pipelines, and adding or removing the appropriate operations from each pipeline as required.\n",
    "\n",
    "Augmentor does not support this natively, but it can be performed easily enough, and here we will learn how. \n",
    "\n",
    "First we import the required libraries:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Using Theano backend.\n",
      "Using gpu device 0: GeForce GTX TITAN X (CNMeM is enabled with initial size: 90.0% of memory, cuDNN not available)\n"
     ]
    }
   ],
   "source": [
    "import Augmentor\n",
    "import numpy as np\n",
    "import os\n",
    "import glob\n",
    "import random\n",
    "import collections\n",
    "\n",
    "from PIL import Image\n",
    "\n",
    "import keras\n",
    "from keras.models import Sequential\n",
    "from keras.layers import Dense, Dropout, Flatten\n",
    "from keras.layers import Conv2D, MaxPooling2D\n",
    "from keras.datasets import mnist\n",
    "\n",
    "random.seed(0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Point to a Root Directory\n",
    "\n",
    "Your root directory must contain subdirectories, one for each class in your machine learning classification problem:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "root_directory = \"/home/marcus/Documents/mnist/train/*\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Scan for folders in the root directory\n",
    "\n",
    "We use `glob.glob()` to scan for all files in the `root_directory` and only choose those that are directories. These will be out classes:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Folders (classes) found: ['6', '0', '3', '4', '7', '5', '1', '8', '2', '9'] \n"
     ]
    }
   ],
   "source": [
    "folders = []\n",
    "for f in glob.glob(root_directory):\n",
    "    if os.path.isdir(f):\n",
    "        folders.append(os.path.abspath(f))\n",
    "\n",
    "print(\"Folders (classes) found: %s \" % [os.path.split(x)[1] for x in folders])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Create a pipeline for each class\n",
    "\n",
    "Now we create a pipeline object for each of the classes. MNIST consists of 10 digits, and each digit represents one class:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Folder /home/marcus/Documents/mnist/train/6:\n",
      "Initialised with 5918 image(s) found.\n",
      "Output directory set to /home/marcus/Documents/mnist/train/6/output.\n",
      "----------------------------\n",
      "\n",
      "Folder /home/marcus/Documents/mnist/train/0:\n",
      "Initialised with 5923 image(s) found.\n",
      "Output directory set to /home/marcus/Documents/mnist/train/0/output.\n",
      "----------------------------\n",
      "\n",
      "Folder /home/marcus/Documents/mnist/train/3:\n",
      "Initialised with 6131 image(s) found.\n",
      "Output directory set to /home/marcus/Documents/mnist/train/3/output.\n",
      "----------------------------\n",
      "\n",
      "Folder /home/marcus/Documents/mnist/train/4:\n",
      "Initialised with 5842 image(s) found.\n",
      "Output directory set to /home/marcus/Documents/mnist/train/4/output.\n",
      "----------------------------\n",
      "\n",
      "Folder /home/marcus/Documents/mnist/train/7:\n",
      "Initialised with 6265 image(s) found.\n",
      "Output directory set to /home/marcus/Documents/mnist/train/7/output.\n",
      "----------------------------\n",
      "\n",
      "Folder /home/marcus/Documents/mnist/train/5:\n",
      "Initialised with 5421 image(s) found.\n",
      "Output directory set to /home/marcus/Documents/mnist/train/5/output.\n",
      "----------------------------\n",
      "\n",
      "Folder /home/marcus/Documents/mnist/train/1:\n",
      "Initialised with 6742 image(s) found.\n",
      "Output directory set to /home/marcus/Documents/mnist/train/1/output.\n",
      "----------------------------\n",
      "\n",
      "Folder /home/marcus/Documents/mnist/train/8:\n",
      "Initialised with 5851 image(s) found.\n",
      "Output directory set to /home/marcus/Documents/mnist/train/8/output.\n",
      "----------------------------\n",
      "\n",
      "Folder /home/marcus/Documents/mnist/train/2:\n",
      "Initialised with 5958 image(s) found.\n",
      "Output directory set to /home/marcus/Documents/mnist/train/2/output.\n",
      "----------------------------\n",
      "\n",
      "Folder /home/marcus/Documents/mnist/train/9:\n",
      "Initialised with 5949 image(s) found.\n",
      "Output directory set to /home/marcus/Documents/mnist/train/9/output.\n",
      "----------------------------\n",
      "\n"
     ]
    }
   ],
   "source": [
    "pipelines = {}\n",
    "for folder in folders:\n",
    "    print(\"Folder %s:\" % (folder))\n",
    "    pipelines[os.path.split(folder)[1]] = (Augmentor.Pipeline(folder))\n",
    "    print(\"\\n----------------------------\\n\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can summarise what was scanned:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Class 1 has 6742 samples.\n",
      "Class 0 has 5923 samples.\n",
      "Class 3 has 6131 samples.\n",
      "Class 2 has 5958 samples.\n",
      "Class 5 has 5421 samples.\n",
      "Class 4 has 5842 samples.\n",
      "Class 7 has 6265 samples.\n",
      "Class 6 has 5918 samples.\n",
      "Class 9 has 5949 samples.\n",
      "Class 8 has 5851 samples.\n"
     ]
    }
   ],
   "source": [
    "for p in pipelines.values():\n",
    "    print(\"Class %s has %s samples.\" % (p.augmentor_images[0].class_label, len(p.augmentor_images)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Add operations to the pipelines\n",
    "\n",
    "Here we will add operations to each of the pipelines. Some operations will be applied to all pipelines, others only to some pipelines.\n",
    "\n",
    "Here we add a rotate operation to all pipelines (and hence will be applied to all digits):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "for pipeline in pipelines.values():\n",
    "    pipeline.rotate(probability=0.5, max_left_rotation=5, max_right_rotation=5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we add some operations that we only want to apply to certain classes. The figure 8 can be flipped horizontally and vertically:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "pipelines[\"8\"].flip_top_bottom(probability=0.5)\n",
    "pipelines[\"8\"].flip_left_right(probability=0.5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "While the figure 3 can only be flipped vertically:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "pipelines[\"3\"].flip_top_bottom(probability=0.5)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Define a class label / class integer map\n",
    "\n",
    "The classes will have string labels associated with them, depending on the name of each class's parent folder. Here you must map the names of each of your classes with the 0-based index (which must correspond to the test data of your dataset).\n",
    "\n",
    "In the case of MNIST this is quite easy, the samples for the digit 0 were stored in a folder 0 and have the text label 0, and so on:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "integer_labels = {'0': 0, \n",
    "                  '1': 1, \n",
    "                  '2': 2, \n",
    "                  '3': 3, \n",
    "                  '4': 4, \n",
    "                  '5': 5, \n",
    "                  '6': 6, \n",
    "                  '7': 7, \n",
    "                  '8': 8, \n",
    "                  '9': 9}"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6. Define pipeline containers to store the pipelines and additional information\n",
    "\n",
    "Later we will need each pipeline's 0-based integer label as well as its categorical label (depending on the type of neural network you define):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "PipelineContainer = collections.namedtuple('PipelineContainer', \n",
    "                                           'label label_integer label_categorical pipeline generator')\n",
    "\n",
    "pipeline_containers = []\n",
    "\n",
    "for label, pipeline in pipelines.items():\n",
    "    label_categorical = np.zeros(len(pipelines), dtype=np.int)\n",
    "    label_categorical[integer_labels[label]] = 1\n",
    "    pipeline_containers.append(PipelineContainer(label, \n",
    "                                                 integer_labels[label], \n",
    "                                                 label_categorical, \n",
    "                                                 pipeline, \n",
    "                                                 pipeline.keras_generator(batch_size=1)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 7. Define a generator function\n",
    "\n",
    "Neural networks in Keras can be supplied with a generator to supply training data. Because we have one generator for each pipeline, we need to create \"generator of generators\":"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "def multi_generator(pipeline_containers, batch_size):\n",
    "    while True:\n",
    "        X = []\n",
    "        y = []\n",
    "        for i in range(batch_size):\n",
    "            pipeline_container = random.choice(pipeline_containers)\n",
    "            image, _ = next(pipeline_container.generator)\n",
    "            image = image.reshape((28,28,1)) # Or (1, 28, 28) for channels_first, see Keras' docs.\n",
    "            X.append(image)\n",
    "            y.append(pipeline_container.label_categorical)  # Or label_integer if required by network\n",
    "        X = np.asarray(X)\n",
    "        y = np.asarray(y)\n",
    "        yield X, y"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 8. Create the generator object\n",
    "\n",
    "Create a generator, `g` to pass data randomly from each pipeline (and hence each class) to a neural network:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "batch_size = 128\n",
    "\n",
    "g = multi_generator(pipeline_containers=pipeline_containers, \n",
    "                    batch_size=batch_size)  # Here the batch size can be set to any value"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To generate a batch of 128 images and labels, at random from a random pipeline defined above, we can use the `next()` function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "X, y = next(g)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can confirm that we are receiving images in batches of 128 and that the labels correspond to the images in each pipeline:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "128 images returned. 128 labels returned.\n"
     ]
    }
   ],
   "source": [
    "print(\"%s images returned. %s labels returned.\") % (np.shape(X)[0], len(y))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We can use PIL to view the augmented images and cofirm the labels match (note that PIL requires images to be specified differently to how Keras expects data, hence some preprocessing of the data must be performed):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAABwAAAAcCAAAAABXZoBIAAABr0lEQVR4nJWSzUtUYRTGf+d675hf\nNM4wQmBhaiChMDIRLXLVInJRLdq56a8o+gdKiLCdgotKdKXUykUtE42ghIqiGqSoRck0XzaTc31n\n7mlxr3fuWC06i5f34cd7zuF5H1H+XVarVG2ef0ARcus5Qf8GgaWrcyWR/UYHaoqB+7v+1YrMUgCj\n7Z+f5qMzFalt5QHlw7aHMRGoSPXR3Y8CkC+YroGe6EupP7m1agAh5tEdb29CFZ7PvEmfAKBSxLOt\nEKrwbXZt8NIRADyIHe0MoWAerjE8igJeuYF6gQk2Kmze+9I2kgGovJv/QeknEkDQ5SynJkE8c+fm\nniW/dgKzbITyqx3Ongf3wY1Sd7LoUg4hVBrwesV99jZbyNyeX60lktpsm0yJbmQ15zY6picWDImu\nuhPCQxdeblWrwLXLGTtuE0/ZYVuVi7r4ye47PXbuGMQsBo9LCEUTV9Jfd1Mnkw4Uv++19ccjC6E9\n4+PGAVRco6p1/I0sQFBwQBFqhYbVGfju/0qQCgEq701/ujfiULQOD5Unzzj76kCCzOPrL1Q9X0hr\nqFX8YPhK/iPxrfUbQ4DLuKOlNzwAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<PIL.Image.Image image mode=L size=28x28 at 0x7F061D62D850>"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "image_index = 3  # Take image index 3 from the batch\n",
    "\n",
    "x_array = X[image_index]\n",
    "x_array = x_array.reshape((28,28))\n",
    "x_array = x_array * 255\n",
    "x_array = x_array.astype(np.uint8)\n",
    "Image.fromarray(x_array)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The label below should correspond to the image output above:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Image label: 4\n"
     ]
    }
   ],
   "source": [
    "print(\"Image label: %s\" % (np.nonzero(y[image_index])[0][0]))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 9. Train a neural network with the generator\n",
    "\n",
    "Last, we train a neural network with the differing pipelines for each class.\n",
    "\n",
    "First we define a neural network:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "num_classes = len(pipelines)\n",
    "input_shape = (28, 28, 1)\n",
    "\n",
    "model = Sequential()\n",
    "model.add(Conv2D(32, kernel_size=(3, 3),\n",
    "                 activation='relu',\n",
    "                 input_shape=input_shape))\n",
    "model.add(Conv2D(64, (3, 3), activation='relu'))\n",
    "model.add(MaxPooling2D(pool_size=(2, 2)))\n",
    "model.add(Dropout(0.25))\n",
    "model.add(Flatten())\n",
    "model.add(Dense(128, activation='relu'))\n",
    "model.add(Dropout(0.5))\n",
    "model.add(Dense(num_classes, activation='softmax'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once a network has been defined, you can compile it so that the model is ready to be trained with data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "model.compile(loss=keras.losses.categorical_crossentropy,\n",
    "              optimizer=keras.optimizers.Adadelta(),\n",
    "              metrics=['accuracy'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Using the same batch size as the generator above, we can begin to train the neural network:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 1/10\n",
      "390/390 [==============================] - 31s - loss: 0.4767 - acc: 0.8502    \n",
      "Epoch 2/10\n",
      "390/390 [==============================] - 31s - loss: 0.1364 - acc: 0.9595    \n",
      "Epoch 3/10\n",
      "390/390 [==============================] - 30s - loss: 0.1029 - acc: 0.9691    \n",
      "Epoch 4/10\n",
      "390/390 [==============================] - 30s - loss: 0.0879 - acc: 0.9740    \n",
      "Epoch 5/10\n",
      "390/390 [==============================] - 30s - loss: 0.0778 - acc: 0.9771    \n",
      "Epoch 6/10\n",
      "390/390 [==============================] - 30s - loss: 0.0689 - acc: 0.9784    \n",
      "Epoch 7/10\n",
      "390/390 [==============================] - 30s - loss: 0.0650 - acc: 0.9804    \n",
      "Epoch 8/10\n",
      "390/390 [==============================] - 31s - loss: 0.0632 - acc: 0.9808    \n",
      "Epoch 9/10\n",
      "390/390 [==============================] - 30s - loss: 0.0594 - acc: 0.9828    \n",
      "Epoch 10/10\n",
      "390/390 [==============================] - 30s - loss: 0.0531 - acc: 0.9845    \n"
     ]
    }
   ],
   "source": [
    "h = model.fit_generator(g, steps_per_epoch=50000/batch_size, epochs=10, verbose=1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conclusion\n",
    "\n",
    "Certain tasks may require different augmentation strategies on a class-by-class basis. The procedure above allows you to do this using Augmentor and Keras."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "empirical",
   "language": "python",
   "name": "empirical"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
